Evaluating MapReduce for Multi-core and Multiprocessor Systems
Not recorded
Abstract
As multi-core chips become ubiquitous, it is critical to develop parallel programming models and runtime systems that can harness their computational capabilities. In this paper, we evaluate the suitability of the MapReduce model for multi-core and multi-processor systems. MapReduce was developed by Google to program and manage data-centers with thousands of servers. It allows programmers to write functional-style code that is automatically parallelized and scheduled on a distributed system. We describe Phoenix, an implementation of MapReduce for shared-memory parallel systems that includes a programming API and an efficient runtime system. The Phoenix runtime automatically manages thread generation, dynamic task scheduling, data partitioning, and fault-tolerance across the processor nodes. We evaluate Phoenix with a diverse set of benchmarks on both multi-core and symmetric multiprocessor systems. We demonstrate that Phoenix leads to excellent speedups on both system types. The speedups are robust across a wide range of system and dataset characteristics. We also show that Phoenix can automatically recover from transient and permanent errors on map and reduce tasks. Finally, we compare Phoenix to parallel versions of the benchmarks written in the lower-level Pthreads API and demonstrate that, despite the overheads associated with the MapReduce model, Phoenix leads to competitive performance with significantly simpler code. Overall, we establish that MapReduce provides a promising method for programming and managing parallel applications on multi-core and other shared-memory systems.
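The programming model described in the abstract can be illustrated with a minimal sketch. This is not the Phoenix API (Phoenix is not shown in this text); the function names `split_input`, `map_chunk`, `merge_counts`, and `word_count` are hypothetical, and the example assumes a simple word-count workload. It shows the general shape only: the programmer supplies a map function and a reduce function, while the runtime (here a standard-library thread pool) handles partitioning and scheduling on a shared-memory machine.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_input(words, num_chunks):
    """Partition the input into roughly equal contiguous chunks."""
    size = max(1, -(-len(words) // num_chunks))  # ceiling division
    return [words[i:i + size] for i in range(0, len(words), size)]


def map_chunk(chunk):
    """Map task (hypothetical): count the words within one chunk."""
    return Counter(chunk)


def merge_counts(left, right):
    """Reduce step (hypothetical): merge two partial count tables."""
    left.update(right)
    return left


def word_count(text, num_workers=4):
    """Run map tasks on a thread pool, then reduce the partial results."""
    words = text.lower().split()
    chunks = split_input(words, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return dict(reduce(merge_counts, partials, Counter()))


if __name__ == "__main__":
    print(word_count("the quick fox jumps over the lazy dog the end"))
```

In standard CPython the global interpreter lock limits the speedup that threads give on CPU-bound work, so this sketch conveys the structure of the model rather than the performance results the abstract reports. It also omits the fault recovery and dynamic scheduling that the abstract attributes to the Phoenix runtime.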
Similar resources
Parallel support vector machines on multi-core and multiprocessor systems
This paper proposes a new and efficient parallel implementation of support vector machines based on a decomposition method for handling large-scale datasets. The parallelization is applied to the most time- and memory-consuming part of training, i.e., updating the vector f. The inner problems are handled by a sequential minimal optimization solver. Since the underlying parallelism is realized by the...
Optimized Runtime Systems for MapReduce Applications in Multi-core Clusters
Pre-scheduling and Scheduling of Task Graph on Homogeneous Multiprocessor Systems
Task graph scheduling is a multi-objective optimization and NP-hard problem. In this paper, a new algorithm for homogeneous multiprocessor systems is proposed. Basically, scheduling algorithms aim to balance the two parameters of time and energy consumption. These two parameters are, up to a certain limit, in conflict with each other, and improvement of one causes reduction in the othe...
A Multiprocessor System with Non-Preemptive Earliest-Deadline-First Scheduling Policy: A Performability Study
This paper introduces an analytical method for approximating the performability of a firm real-time system modeled by a multi-server queue. The service discipline in the queue is earliest-deadline-first (EDF), which is an optimal scheduling algorithm. Real-time jobs with exponentially distributed relative deadlines arrive according to a Poisson process. All jobs have deadlines until the end of s...
Thread-level priority assignment in global multiprocessor scheduling for DAG tasks
The advent of multi- and many-core processors offers enormous performance potential for parallel tasks that exhibit sufficient intra-task thread-level parallelism. With the growth of novel parallel programming models (e.g., OpenMP, MapReduce), scheduling parallel tasks in the real-time context has received increasing attention in the recent past. While most studies focused on schedulability anal...